Minimizing Maximum Regret in Commitment Constrained Sequential Decision Making
Abstract
In cooperative multiagent planning, it can often be beneficial for an agent to make commitments about aspects of its behavior to others, allowing them in turn to plan their own behaviors without taking the agent’s detailed behavior into account. Extending previous work in the Bayesian setting, we consider instead a worst-case setting in which the agent has a set of possible environments (MDPs) it could be in, and develop a commitment semantics that allows for probabilistic guarantees on the agent’s behavior in any of the environments it could end up facing. Crucially, an agent receives observations (of reward and state transitions) that allow it to potentially eliminate possible environments and thus obtain higher utility by adapting its policy to the history of observations. We develop algorithms and provide theory and some preliminary empirical results showing that they ensure an agent meets its commitments with history-dependent policies while minimizing maximum regret over the possible environments.
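The objective described in the abstract can be illustrated with a deliberately simplified sketch. It is not the paper's algorithm: it assumes a small finite set of candidate policies whose expected utility and probability of meeting the commitment are already known for each possible environment, and it ignores history-dependent adaptation. The names `minimax_regret_policy`, `values`, `commit_prob`, and `threshold` are hypothetical, and measuring regret against the best commitment-satisfying policy in each environment is an assumption made here for illustration.

```python
from typing import List, Optional, Tuple


def minimax_regret_policy(
    values: List[List[float]],
    commit_prob: List[List[float]],
    threshold: float,
) -> Tuple[Optional[int], Optional[float]]:
    """Pick the commitment-satisfying policy with the smallest worst-case regret.

    values[p][e]      -- expected utility of policy p in environment e (assumed given)
    commit_prob[p][e] -- probability that policy p meets the commitment in environment e
    threshold         -- probability the commitment must be met with in every environment
    """
    n_envs = len(values[0])

    # A policy is feasible only if it honors the commitment in every environment.
    feasible = [
        p for p in range(len(values))
        if all(commit_prob[p][e] >= threshold for e in range(n_envs))
    ]
    if not feasible:
        return None, None

    # Best utility achievable in each environment among feasible policies (assumption).
    best = [max(values[p][e] for p in feasible) for e in range(n_envs)]

    # Worst-case regret of each feasible policy over the possible environments.
    max_regret = {
        p: max(best[e] - values[p][e] for e in range(n_envs)) for p in feasible
    }
    chosen = min(feasible, key=lambda p: max_regret[p])
    return chosen, max_regret[chosen]


# Hypothetical numbers: four policies, two possible environments.
values = [[10, 0], [0, 10], [6, 6], [9, 9]]
commit_prob = [[0.9, 0.9], [0.9, 0.9], [0.9, 0.9], [0.5, 0.9]]
print(minimax_regret_policy(values, commit_prob, threshold=0.8))  # (2, 4)
```

In this toy instance the last policy has the highest utility but fails the commitment in the first environment, so it is excluded. Among the remaining policies, the hedging policy (index 2) has the smallest worst-case regret.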
Related resources
Sequential Decision Making with Rank Dependent Utility: A Minimax Regret Approach
This paper is devoted to sequential decision making with Rank Dependent expected Utility (RDU). This decision criterion generalizes Expected Utility and makes it possible to model a wider range of observed (rational) behaviors. In such a sequential decision setting, two conflicting objectives can be identified in the assessment of a strategy: maximizing the performance viewed from the initial state (opti...
Efficient Constrained Regret Minimization
Online learning constitutes a mathematical and compelling framework to analyze sequential decision making problems in adversarial environments. The learner repeatedly chooses an action, the environment responds with an outcome, and then the learner receives a reward for the played action. The goal of the learner is to maximize his total reward. However, there are situations in which, in additio...
Relationship Decision-Making as a Mediator between Regret, Autonomy, and Two Forms of Relationship Commitment: Dedication and Constraint
Ashley M. A. Fehr, Old Dominion University, 2015. Director: Dr. James M. Henson. This study examined the relationships among autonomy, anticipated regret, decision-making, and dedication and constraint commitment of college students in romantic relationships. Two...
k-Regret Minimizing Set: Hardness and Efficient Algorithms
We study the k-regret minimizing query (k-RMS), which is a useful operator for supporting multi-criteria decision-making. Given two integers k and r, a k-RMS returns r tuples from the database which minimize the k-regret ratio, defined as one minus the worst ratio between the k-th maximum utility score among all tuples in the database and the maximum utility score of the r tuples returned. A sol...
Safety-Aware Algorithms for Adversarial Contextual Bandit
In this work we study the safe sequential decision making problem under the setting of adversarial contextual bandits with sequential risk constraints. At each round, nature prepares a context, a cost for each arm, and additionally a risk for each arm. The learner leverages the context to pull an arm and receives the corresponding cost and risk associated with the pulled arm. In addition to min...
Publication date: 2017